Paper Template for Speech Prosody 2002
Authors
Abstract
This paper presents a study on the control of fundamental frequency (F0) in Cantonese text-to-speech (TTS) systems. The surface F0 contour of an utterance is treated as the combination of tone-related local components and phrase-level long-term variation. A novel method of F0 normalization has been developed to separate the two effectively. Statistical analysis is performed on the phrase curves and tone contours extracted from a large speech corpus, and the results are summarized into regular patterns. These patterns serve as the basic templates in a non-parametric F0 model, from which utterance-level F0 contours can be generated. A perceptual test shows that speech naturalness is significantly improved by the new F0 model: the MOS increases by 0.65 on a five-point scale.
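The decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the phrase-level and tone-level components are combined, so the additive combination in the log-F0 domain used here is an assumption, as are the linear phrase curve, the tone labels, the template values, and every function name (`phrase_curve`, `generate_f0`, `normalize_f0`). None of these come from the paper.

```python
import numpy as np

# Hypothetical tone templates: log-F0 offsets relative to the phrase curve.
# The labels and values are invented for illustration only.
TONE_TEMPLATES = {
    "high_level": np.array([0.10, 0.10, 0.10, 0.10]),
    "rising": np.array([-0.10, -0.05, 0.02, 0.10]),
    "low_falling": np.array([-0.05, -0.10, -0.15, -0.20]),
}


def phrase_curve(n_frames, start_hz=220.0, end_hz=180.0):
    """Assumed phrase-level component: a straight declination line in log-F0."""
    return np.linspace(np.log(start_hz), np.log(end_hz), n_frames)


def resample(template, n_frames):
    """Stretch a template to n_frames points by linear interpolation."""
    x_old = np.linspace(0.0, 1.0, len(template))
    x_new = np.linspace(0.0, 1.0, n_frames)
    return np.interp(x_new, x_old, template)


def tone_component(tones, frames_per_syllable):
    """Concatenate the duration-stretched tone templates of all syllables."""
    return np.concatenate(
        [resample(TONE_TEMPLATES[t], n) for t, n in zip(tones, frames_per_syllable)]
    )


def generate_f0(tones, frames_per_syllable):
    """Superpose phrase curve and tone contours in log-F0, return F0 in Hz."""
    local = tone_component(tones, frames_per_syllable)
    return np.exp(phrase_curve(len(local)) + local)


def normalize_f0(f0_hz, phrase_log_f0):
    """Remove the phrase-level trend, leaving the tone-related residual."""
    return np.log(f0_hz) - phrase_log_f0


tones = ["high_level", "rising", "low_falling"]
frames = [10, 12, 8]
f0 = generate_f0(tones, frames)
residual = normalize_f0(f0, phrase_curve(len(f0)))
```

Under these assumptions, normalization is simply the inverse of generation: subtracting the phrase curve from the log-F0 contour recovers the tone component. On real speech the phrase curve is unknown and has to be estimated from the data, which is the part the paper's normalization method addresses.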
Similar resources
A Novel Prosody Adaptation Method for Mandarin Concatenation-Based Text-to-speech System
The paper presents a prosody adaptation method which is able to adapt the prosody model of text to speech (TTS) to a new style with a small training corpus. Unlike the conventional prosody mapping between two parallel prosody features, the paper tries to integrate the prosody conversion into the prosody generation model of TTS. In the paper, we use a template based prosody model which consists ...
Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour
This paper presents a prosody generation method for Chinese mandarin using the template of quantified prosodic unit and base intonation contour. This method uses the prosodic feature picked-up from the syllables in the prosody words by rule as the base unit, and integrates the prosody rules in the prosody words of Chinese mandarin and base intonation contour to achieve the prosody contours with...
Integrating rule and template-based approaches for emotional Malay speech synthesis
The manipulation of prosody, including pitch, duration and intensity, is one of the leading approaches in synthesizing emotion. This paper reports work on the development of a Malay Emotional synthesizer capable of expressing four basic emotions, namely happiness, anger, sadness and fear for any form of text input with various intonation patterns using the prosody manipulation principle. The sy...
AlfaNum System for Speech Synthesis in Serbian Language
This paper presents some basic criteria for conception of a concatenative text-to-speech synthesizer in Serbian language. The paper describes the prosody generator which was used and reflects upon several peculiarities of Serbian language which led to its adoption. Within the paper, the results of an experiment showing the influence of natural-sounding prosody on human speech recognition are dis...
Template-Based Automatic Speech Recognition Meets Prosody
In this paper, we use prosodic information to improve the accuracy of our template-based automatic speech recognizer. Prosodic information is harvested adopting a data-driven approach. A number of prosodic features is extracted, then combined into major groups, and finally studied separately and together. All acoustic evidence, both segmental and suprasegmental, is modelled non-parametrically. ...
Journal title:
Volume, issue
Pages -
Publication date 2004